Predicting Subcellular Locations of Eukaryotic Proteins Using Bayesian and k-Nearest Neighbor Classifiers

Authors

  • Han C. W. Hsiao
  • Shih-Hao Chen
  • Pei-Chun Chang
  • Jeffrey J. P. Tsai
Abstract

The function of a protein is closely related to its subcellular location. A reliable method for predicting protein subcellular location is therefore needed, especially when a large number of proteins must be analyzed. Various methods have been proposed for this task, but their results are not satisfactory in terms of effectiveness and efficiency. A hybrid approach combining a naïve Bayesian classifier and a k-nearest neighbor classifier is proposed to classify eukaryotic proteins represented as a combination of amino acid composition, dipeptide composition, and functional domain composition. Experimental results show that the total accuracy on a set of 17,655 proteins can reach up to 91.5%.
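Two of the three representations named in the abstract have standard definitions, so they can be illustrated with a minimal Python sketch: amino acid composition (20 residue frequencies) and dipeptide composition (400 adjacent-pair frequencies). The abstract does not say how the naïve Bayesian and k-nearest neighbor classifiers are combined, nor how functional domain composition is encoded, so neither is reproduced here. The function names, the Euclidean distance, the default `k=3`, and the plain majority-vote `knn_predict` are assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) or 1  # avoid division by zero on empty input
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered pair of adjacent residues (400 values)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip pairs containing non-standard symbols such as X or B.
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(pairs)
    total = len(pairs) or 1
    return [counts[d] / total for d in DIPEPTIDES]


def features(seq):
    """Concatenated 420-dimensional feature vector (assumed layout)."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


def knn_predict(train, query_seq, k=3):
    """Majority vote among the k nearest training sequences.

    train is a list of (sequence, label) pairs. Euclidean distance and
    unweighted voting are illustrative assumptions.
    """
    q = features(query_seq)
    ranked = sorted(train, key=lambda item: math.dist(features(item[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With a toy training list such as `[("AAAAAA", "x"), ("WWWWWW", "y")]`, `knn_predict(train, "AAAACA", k=1)` returns the label of the compositionally closest sequence. The sequences and labels here are made up and carry no biological meaning.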


Similar resources

Diagnosis of Tempromandibular Disorders Using Local Binary Patterns

Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal...


An Ensemble Classifier for Eukaryotic Protein Subcellular Location Prediction Using Gene Ontology Categories and Amino Acid Hydrophobicity

With the rapid increase of protein sequences in the post-genomic age, it is challenging to develop accurate and automated methods for reliably and quickly predicting their subcellular localizations. Many approaches have been tried so far, but most used only a single algorithm. In this paper, we propose an ensemble classifier of KNN (k-nearest neighbor) and SVM (support vector machine)...


Prediction of Subcellular Localization of Apoptosis Proteins by Dipeptide Composition

By cluster analysis, all dipeptides are classified into 16 categories according to their hydrophobicity. Based on the composition of these dipeptide categories, a novel representation of protein sequences is proposed here to predict the subcellular location of apoptosis protein sequences. Using a K-Nearest Neighbor classifier and testing on a known dataset which includes 317 apoptosis proteins, the high...


Identification of selected monogeneans using image processing, artificial neural network and K-nearest neighbor

Over the last two decades, improvements in computational tools have contributed significantly to classifying images of biological specimens into their corresponding species. These days, identifying biological species is much easier for taxonomists and even non-taxonomists thanks to the development of automated computer techniques and systems. In this study, we d...


A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is unwanted email that is harmful to communications around the world. It is a growing problem in personal email, so detecting it is essential. Machine learning is very useful for this problem, as it shows good results in learning the requisite patterns for classification owing to its adaptive nature. Nonetheless, in spam detection, there are a large num...



Journal:
  • J. Inf. Sci. Eng.

Volume 24

Publication date: 2008